Reduce default checkpoint distance flag from 40 to 20 to speedup EN startup #2223

Merged 1 commit on Mar 29, 2022

fxamacker commented on Mar 29, 2022

Reduce checkpoint distance flag to trigger creation of checkpoints more frequently.

Prior to #1944, checkpoint creation took 11-15+ hours, so it wasn't feasible to create checkpoints more frequently.

With this change, fewer WAL segments are needed to trigger checkpoint creation, which will speed up EN startup because fewer WALs have to be replayed during startup.
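
To make the trigger concrete, here is a minimal sketch of how a checkpoint distance might gate checkpoint creation. The function `shouldCheckpoint` and its parameters are hypothetical names chosen for illustration and are not taken from the flow-go code; the sketch only assumes that a checkpoint becomes due once at least `distance` WAL segments have accumulated since the last checkpoint.

```go
package main

import "fmt"

// shouldCheckpoint reports whether a new checkpoint is due.
// Hypothetical helper: it assumes WAL segments are numbered consecutively and
// that lastCheckpointed is the number of the last segment covered by a checkpoint.
func shouldCheckpoint(latestSegment, lastCheckpointed, distance uint) bool {
	if latestSegment <= lastCheckpointed {
		return false
	}
	return latestSegment-lastCheckpointed >= distance
}

func main() {
	// 25 segments have been written since the last checkpoint at segment 100.
	fmt.Println(shouldCheckpoint(125, 100, 40)) // distance 40: not yet due
	fmt.Println(shouldCheckpoint(125, 100, 20)) // distance 20: due
}
```

Under this assumption, a smaller distance bounds the number of segments left to replay after the most recent checkpoint more tightly.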

On benchnet, replaying 20 WALs is 3.5 minutes faster than replaying 40 WALs.

This change will also speed up checkpoint file creation by the same duration because that also requires replaying WALs.

closes #2207
updates https://github.com/dapperlabs/flow-go/issues/6114

Reduce checkpoint distance flag to trigger creation
of checkpoints more frequently.

This change will cause fewer WAL segments to trigger
checkpoint creation, which will speed up EN startup
by having fewer WALs to replay during startup.

On benchnet-dev-004, replaying 20 WALs was 3.5 minutes
faster than replaying 40 WALs.

This change will also speed up checkpoint file creation by
the same amount because that also requires replaying WALs.

fxamacker requested a review from zhangchiqing on March 29, 2022 14:16
fxamacker self-assigned this on Mar 29, 2022

@@ -145,7 +145,7 @@ func main() {
 	flags.StringVar(&triedir, "triedir", datadir, "directory to store the execution State")
 	flags.StringVar(&executionDataDir, "execution-data-dir", filepath.Join(homedir, ".flow", "execution_data_blobstore"), "directory to use for Execution Data blobstore")
 	flags.Uint32Var(&mTrieCacheSize, "mtrie-cache-size", 500, "cache size for MTrie")
-	flags.UintVar(&checkpointDistance, "checkpoint-distance", 40, "number of WAL segments between checkpoints")
+	flags.UintVar(&checkpointDistance, "checkpoint-distance", 20, "number of WAL segments between checkpoints")

Contributor

Just a thought: maybe we should use a const rather than a magic number.

Member Author

Great point! I'll open an issue/PR to replace magic numbers in the flags. E.g.

[image]

fxamacker merged commit 6faf64a into master on Mar 29, 2022
fxamacker deleted the fxamacker/change-checkpoint-distance-to-20 branch on March 29, 2022 16:51
fxamacker added the Execution Cadence Execution Team label on Jul 14, 2022
Labels: Execution Cadence Execution Team, Performance

Successfully merging this pull request may close these issues:
[Execution Node] EN startup time can be further reduced by reducing number of WAL segments

3 participants